Topics Over Nonparametric Time: A Supervised Topic Model Using Bayesian Nonparametric Density Estimation
نویسندگان
چکیده
We propose a new supervised topic model that uses a nonparametric density estimator to model the distribution of real-valued metadata given a topic. The model is similar to Topics Over Time, but replaces the beta distributions used in that model with a Dirichlet process mixture of normals. The use of a nonparametric density estimator allows for the fitting of a greater class of metadata densities. We compare our model with existing supervised topic models in terms of prediction and show that it is capable of discovering complex metadata distributions in both synthetic and real data.
منابع مشابه
Using POMDPs to Forecast Kindergarten Students' Reading Comprehension
Using POMDPs to Forecast Kindergarten Students’ Reading Comprehension . . . . . . . . . . . . . . . . . . . . 1 Russell Almond, Umit Tokac and Stephanie Al Ortaiba High-Level Information Fusion with Bayesian Semantics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8 Paulo Costa, Kathryn Laskey, Kuochu Chang, Wei Sun, Cheol Park and Shou Matsumoto Goal-Based Person T...
متن کاملKernel Methods for Nonparametric Bayesian Inference of Probability Densities and Point Processes
Nonparametric kernel methods for estimation of probability densities and point process intensities have long been of interest to researchers in statistics and machine learning. Frequentist kernel methods are widely used, but provide only a point estimate of the unknown density. Additionally, in frequentist kernel density methods, it can be difficult to select appropriate kernel parameters. The ...
متن کاملBayesian Nonparametric Collaborative Topic Poisson Factorization for Electronic Health Records-Based Phenotyping
Phenotyping with electronic health records (EHR) has received much attention in recent years because the phenotyping opens a new way to discover clinically meaningful insights, such as disease progression and disease subtypes without human supervisions. In spite of its potential benefits, the complex nature of EHR often requires more sophisticated methodologies compared with traditional methods...
متن کاملDiscovering shared and individual latent structure in multiple time series
This paper proposes a nonparametric Bayesian method for exploratory data analysis and feature construction in continuous time series. Our method focuses on understanding shared features in a set of time series that exhibit significant individual variability. Our method builds on the framework of latent Diricihlet allocation (LDA) and its extension to hierarchical Dirichlet processes, which allo...
متن کاملBayesian multivariate density estimation for observables and random effects
Multivariate density estimation is approached using Bayesian nonparametric mixture of normals models. Two models are developed which are both centred over a multivariate normal distribution but make different prior assumptions about how the unknown distribution departs from a normal distribution. The priors are applied to density estimation of both observables and random effects (or other unobs...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
عنوان ژورنال:
دوره شماره
صفحات -
تاریخ انتشار 2012